10 research outputs found

    A review on deep-learning-based cyberbullying detection

    Bullying is undesirable behavior by others that harms an individual physically, mentally, or socially. Cyberbullying, also known as online bullying, is a virtual form of bullying or harassment (e.g., through text or images). Cyberbullying detection is a pressing need, as the prevalence of cyberbullying is continually growing, resulting in mental health issues. Conventional machine learning models were previously used to identify cyberbullying. However, current research demonstrates that deep learning surpasses traditional machine learning algorithms in identifying cyberbullying for several reasons, including its ability to handle extensive data, classify text and images efficiently, and extract features automatically through hidden layers, among others. This paper reviews the existing surveys and identifies the gaps in those studies. We also present a deep-learning-based defense ecosystem for cyberbullying detection, including data representation techniques and different deep-learning-based models and frameworks. We critically analyze the existing DL-based cyberbullying detection techniques and identify their significant contributions and the future research directions they present. We also summarize the datasets in use, including the DL architecture applied and the tasks accomplished for each dataset. Finally, we present several challenges faced by existing researchers and the open issues to be addressed in the future.

    A Dynamic Weighted Tabular Method for Convolutional Neural Networks

    Traditional Machine Learning (ML) models like Support Vector Machine, Random Forest, and Logistic Regression are generally preferred for classification tasks on tabular datasets. Tabular data consists of rows and columns corresponding to instances and features, respectively. Past studies indicate that traditional classifiers often produce unsatisfactory results on complex tabular datasets. Hence, researchers have attempted to use Convolutional Neural Networks (CNNs) for tabular datasets. Recent studies propose several techniques, such as SuperTML, Conditional GAN (CTGAN), and Tabular Convolution (TAC), for applying CNNs to tabular data. These models outperform the traditional classifiers and substantially improve performance on tabular data. This study introduces a novel technique, the Dynamic Weighted Tabular Method (DWTM), that uses feature weights dynamically based on statistical techniques to apply CNNs to tabular datasets. The method assigns a weight to each feature based on its strength of association with the class labels. Each data point is converted into an image and fed to a CNN model. The features are allocated image canvas space based on their weights. The DWTM improves on the previously mentioned methods because it configures the entire experimental setting dynamically rather than using the static configuration of those methods. Furthermore, it uses the novel idea of using feature weights to allocate image canvas space. In this paper, the DWTM is applied to six benchmark tabular datasets and achieves outstanding performance (i.e., average accuracy = 95%) on all of them.
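The weighting and canvas-allocation idea described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the authors' implementation: the weight is taken to be the absolute Pearson correlation with the label, canvas space is handed out as horizontal pixel rows, and each feature is drawn as a grey-level intensity rather than rendered in whatever form DWTM actually uses. The function names `feature_weights`, `allocate_rows`, and `to_image` are hypothetical.

```python
import numpy as np


def feature_weights(X, y):
    """Weight each feature by |Pearson correlation| with the label (assumed statistic)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    corr = np.abs(Xc.T @ yc) / np.where(denom == 0, 1.0, denom)
    total = corr.sum()
    if total == 0:
        # No feature carries signal: fall back to equal weights.
        return np.full(X.shape[1], 1.0 / X.shape[1])
    return corr / total


def allocate_rows(weights, height):
    """Split `height` pixel rows among features in proportion to their weights.

    Uses largest-remainder rounding so the rows always sum to `height`.
    A very weak feature may receive zero rows.
    """
    raw = weights * height
    rows = np.floor(raw).astype(int)
    leftover = height - rows.sum()
    order = np.argsort(-(raw - rows))  # largest fractional parts first
    rows[order[:leftover]] += 1
    return rows


def to_image(x, rows, width, lo, hi):
    """Draw one data point: each feature fills its rows with a [0, 1] intensity."""
    span = np.where(hi > lo, hi - lo, 1.0)
    scaled = np.clip((x - lo) / span, 0.0, 1.0)
    return np.repeat(scaled, rows)[:, None] * np.ones((1, width))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=200)
    # Synthetic data: feature 0 is strongly tied to the label, feature 1 is noise.
    X = np.column_stack([
        y + 0.1 * rng.normal(size=200),
        rng.normal(size=200),
        0.5 * y + rng.normal(size=200),
    ])
    w = feature_weights(X, y)
    rows = allocate_rows(w, height=32)
    img = to_image(X[0], rows, 32, X.min(axis=0), X.max(axis=0))
    print("weights:", np.round(w, 3), "rows:", rows, "image shape:", img.shape)
```

The resulting array could then be passed to any image classifier; features more strongly associated with the label occupy more of the canvas, which is the property the abstract emphasizes.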

    Data driven classification of opioid patients using machine learning - An investigation

    The opioid crisis has led to an increased number of drug overdoses in recent years. Several approaches have been established to predict opioid prescription by health practitioners. However, due to the complex nature of the problem, the accuracy of such methods is not yet satisfactory. Dependable and reliable classification of opioid-dependent patients from well-grounded data sources is essential. The majority of previous studies do not focus on the users’ mental health association for opioid intake classification. These studies also do not employ the latest deep-learning-based techniques, such as attention and knowledge distillation mechanisms, to find better insights. This paper investigates the opioid classification problem using machine learning and deep-learning-based techniques. We used structured and unstructured data from the MIMIC-III database to identify intentional and unintentional intake of opioid drugs. We selected 455 patient instances and used traditional machine learning and deep learning to predict intentional and accidental users. We obtained 95% and 64% test accuracy in predicting the intentional and accidental users from the structured and unstructured datasets, respectively. We also achieved a knowledge-distillation-based test accuracy of 76.44% by integrating the above two models. Our research includes an ablation analysis, and new insights related to opioid patients are extracted.

    Data Driven Classification of Opioid Patients Using Machine Learning–An Investigation

    The opioid crisis has led to an increased number of drug overdoses in recent years. Several approaches have been established to predict opioid prescription by health practitioners. However, due to the complex nature of the problem, the accuracy of such methods is not yet satisfactory. Dependable and reliable classification of opioid-dependent patients from well-grounded data sources is essential. The majority of previous studies do not focus on the users’ mental health association for opioid intake classification. These studies also do not employ the latest deep-learning-based techniques, such as attention and knowledge distillation mechanisms, to find better insights. This paper investigates the opioid classification problem using machine learning and deep-learning-based techniques. We used structured and unstructured data from the MIMIC-III database to identify intentional and unintentional intake of opioid drugs. We selected 455 patient instances and used traditional machine learning and deep learning to predict intentional and accidental users. We obtained 95% and 64% test accuracy in predicting the intentional and accidental users from the structured and unstructured datasets, respectively. We also achieved a knowledge-distillation-based test accuracy of 76.44% by integrating the above two models. Our research includes an ablation analysis, and new insights related to opioid patients are extracted.

    A Dynamic Weighted Tabular Method for Convolutional Neural Networks

    Traditional Machine Learning (ML) models are generally preferred for classification tasks on tabular datasets, but they often produce unsatisfactory results on complex tabular datasets. Recent works using Convolutional Neural Networks (CNNs) with embedding techniques outperform the traditional classifiers on tabular datasets. However, these embedding techniques do not provide an automated approach that accurately analyzes the importance of the features in the dataset. This study introduces a novel feature embedding technique named Dynamic Weighted Tabular Method (DWTM), which dynamically uses feature weights, based on the strength of their correlations to the class labels, when applying any CNN architecture to tabular datasets. DWTM converts each data point into an image and then feeds it to a CNN architecture. It dynamically embeds the features of the tabular dataset based on their strength and assigns pixel positions to the appropriate features in the image canvas space instead of using a static configuration. In this paper, the DWTM embedding method is applied to six benchmark tabular datasets independently using three different CNN architectures (i.e., ResNet-18, DenseNet, and InceptionV1), and an outstanding performance (an average accuracy of 98%) is obtained, which outperforms both traditional and CNN-based classifiers.

    Predicting Movie Genre Preferences from Personality and Values of Social Media Users

    We propose a novel technique to predict a user’s movie genre preference from her psycholinguistic attributes obtained from her social media interactions. In particular, we build machine-learning-based classification models that take user tweets as input, derive her psychological attributes (personality and value scores), and give her movie genre preference as output. We train these models using the user's tweets on Twitter and her reviews and ratings of movies of different genres in the Internet Movie Database (IMDb). We exploit a key concept of psychology, i.e., that an individual’s personality and values may influence her choice in performing different actions in real life. We have investigated how personality and values independently and collectively influence a user's preference for different movie genres. Our proposed model can be used for recommending movies to social media users.

    Predicting Academic Performance: Analysis of Students’ Mental Health Condition from Social Media Interactions

    Social media have become an indispensable part of people’s daily lives. Research suggests that interactions on social media partly exhibit individuals’ personality, sentiment, and behavior. In this study, we examine the association between students’ mental health and psychological attributes derived from social media interactions and their academic performance. We build a classification model in which students’ psychological attributes and mental health issues are predicted from their social media interactions. Then, students’ academic performance is identified from the psychological attributes and mental health issues predicted at the previous level. Firstly, we select samples using a judgmental sampling technique and collect the textual content from students’ Facebook news feeds. Then, we derive feature vectors using MPNet (Masked and Permuted Pre-training for Language Understanding), which is one of the latest pre-trained sentence transformer models. Secondly, we find two different levels of correlations: (i) between users’ social media usage and their psychological attributes and mental health status, and (ii) between users’ psychological attributes and mental health status and their academic performance. Thirdly, we build a two-level hybrid model to predict academic performance (i.e., Grade Point Average (GPA)) from students’ Facebook posts: (1) from Facebook posts to mental health and psychological attributes using a regression model (SM-MP model) and (2) from psychological and mental attributes to academic performance using a classifier model (MP-AP model). Later, we conduct an evaluation study using real-life samples to validate the performance of the model and compare it with baseline models (i.e., Linguistic Inquiry and Word Count (LIWC) and Empath). Our model shows strong performance, with a micro-averaged F-score of 0.94 and an AUC-ROC score of 0.95. Finally, we build an ensemble model by combining both the psychological attributes and the mental health models and find that our combined model outperforms the independent models.

    Identifying Insomnia From Social Media Posts: Psycholinguistic Analyses of User Tweets

    Background: Many people suffer from insomnia, a sleep disorder characterized by difficulty falling and staying asleep during the night. As social media have become a ubiquitous platform to share users’ thoughts, opinions, activities, and preferences with their friends and acquaintances, the shared content across these platforms can be used to diagnose different health problems, including insomnia. Only a few recent studies have examined the prediction of insomnia from Twitter data, and we found research gaps in predicting insomnia from word usage patterns and correlations between users’ insomnia and their Big 5 personality traits as derived from social media interactions. Objective: The purpose of this study is to build an insomnia prediction model from users’ psycholinguistic patterns, including the elements of word usage, semantics, and their Big 5 personality traits as derived from tweets. Methods: In this paper, we exploited both psycholinguistic and personality traits derived from tweets to identify insomnia patients. First, we built psycholinguistic profiles of the users from their word choices and the semantic relationships between the words of their tweets. We then determined the relationship between a user’s personality traits and insomnia. Finally, we built a double-weighted ensemble classification model to predict insomnia from both psycholinguistic and personality traits as derived from user tweets. Results: Our classification model showed strong prediction potential (78.8%) to predict insomnia from tweets. As insomniacs are generally ill-tempered and feel more stress and mental exhaustion, we observed significant correlations of certain word usage patterns among them. They tend to use negative words (eg, “no,” “not,” “never”). Some people frequently use swear words (eg, “damn,” “piss,” “fuck”) with strong temperament. They also use anxious (eg, “worried,” “fearful,” “nervous”) and sad (eg, “crying,” “grief,” “sad”) words in their tweets. We also found that users with high neuroticism and conscientiousness scores for the Big 5 personality traits likely have strong correlations with insomnia. Additionally, we observed that users with high conscientiousness scores have strong correlations with insomnia patterns, while a negative correlation between extraversion and insomnia was also found. Conclusions: Our model can help predict insomnia from users’ social media interactions. Thus, incorporating our model into a software system can help family members detect insomnia problems in individuals before they become worse. The software system can also help doctors to diagnose possible insomnia in patients.

    Deep learning-based analysis of COVID-19 X-ray images: Incorporating clinical significance and assessing misinterpretation

    COVID-19, pneumonia, and tuberculosis have had a significant effect on recent global health. Since 2019, COVID-19 has been a major factor underlying the increase in respiratory-related terminal illness. Early-stage interpretation and identification of these diseases from X-ray images is essential to aid medical specialists in diagnosis. In this study, a convolutional neural network model (COV-X-net19) is developed and customized with a soft attention mechanism to classify chest X-ray images into four classes: normal, COVID-19, pneumonia, and tuberculosis. Image preprocessing parameters are tuned before the classification models are trained. Moreover, the proposed model is optimized by experimenting with different architectural structures and hyperparameters to further boost performance. The performance of the proposed model is compared with that of eight state-of-the-art transfer learning models. Results suggest that COV-X-net19 outperforms the other models, with a testing accuracy of 95.19%, precision of 96.49%, and F1-score of 95.13%. Another novel aspect of this study is that it investigates the probable reason behind image misclassification by analyzing handcrafted imaging features with statistical evaluation. An analysis of variance (ANOVA) test is performed to identify at which point the model can identify a class accurately and at which point it cannot. The potential features responsible for the misclassification are also found. Moreover, the Random Forest feature importance technique and the Minimum Redundancy Maximum Relevance technique are explored. The methods and findings of this study can benefit clinical practice in early detection and enable a better understanding of the cause of misclassification.

    A Review on Large Language Models: Architectures, Applications, Taxonomies, Open Issues and Challenges

    Large Language Models (LLMs) have recently demonstrated extraordinary capabilities in tasks including natural language processing (NLP), language translation, text generation, and question answering. Moreover, LLMs are a new and essential part of computerized language processing, able to understand complex verbal patterns and generate coherent replies appropriate to the situation. Though this success has prompted a substantial increase in research contributions, the rapid growth has made it difficult to understand the overall impact of these improvements. Because new research on LLMs is appearing so quickly, it is becoming hard to obtain a concise overview of all of it. Consequently, the research community would benefit from a short but thorough review of the recent changes in this area. This article gives a thorough overview of LLMs, including their history, architectures, transformers, resources, training methods, applications, impacts, and challenges. The paper begins by discussing the fundamental concepts of LLMs and the traditional pipeline of the LLM training phase. It then provides an overview of the existing works, the history of LLMs, their evolution over time, the architecture of transformers in LLMs, the different resources of LLMs, and the different training methods that have been used to train them. It also presents the datasets utilized in the studies. After that, the paper discusses the wide range of applications of LLMs, including biomedical and healthcare, education, social, business, and agriculture. It also illustrates how LLMs affect society and shape the future of AI and how they can be used to solve real-world problems. It then explores open issues and challenges in deploying LLMs in real-world settings, including ethical issues, model biases, computing resources, interoperability, contextual constraints, privacy, and security. It also discusses methods to improve the robustness and controllability of LLMs. Finally, the study analyses the future of LLM research and the issues that need to be overcome to make LLMs more impactful and reliable. Overall, this review paper aims to help practitioners, researchers, and experts thoroughly understand the evolution of LLMs, pre-trained architectures, applications, challenges, and future goals. Furthermore, it serves as a valuable reference for the future development and application of LLMs in numerous practical domains.